IEICE global.ieice.org Site

Author Search Result

[Author] Michihiro KOIBUCHI(25hit)

21-25hit(25hit)

A Link Removal Methodology for Application-Specific Networks-on-Chip on FPGAs
Daihan WANG Hiroki MATSUTANI Michihiro KOIBUCHI Hideharu AMANO

PAPER-VLSI Systems

Vol:
E92-D No:4
Page(s):
575-583
The regular 2-D mesh topology has been utilized for most of Network-on-Chips (NoCs) on FPGAs. Spatially biased traffic generated in some applications makes a customization method for removing links more efficient, since some links become low utilization. In this paper, a link removal strategy that customizes the router in NoC is proposed for reconfigurable systems in order to minimize the required hardware amount. Based on the pre-analyzed traffic information, links on which the communication amount is small are removed to reduce the hardware cost while maintaining adequate performance. Two policies are proposed to avoid deadlocks and they outperform up*/down* routing, which is a representative deadlock-free routing on irregular topology. In the case of the image recognition application susan, the proposed method can save 30% of the hardware amount without performance degradation.
A Generalized Theory Based on the Turn Model for Deadlock-Free Irregular Networks
Ryuta KAWANO Ryota YASUDO Hiroki MATSUTANI Michihiro KOIBUCHI Hideharu AMANO

PAPER-Computer System

Pubricized:
2019/10/08
Vol:
E103-D No:1
Page(s):
101-110
Recently proposed irregular networks can reduce the latency for both on-chip and off-chip systems with a large number of computing nodes and thus can improve the performance of parallel applications. However, these networks usually suffer from deadlocks in routing packets when using a naive minimal path routing algorithm. To solve this problem, we focus attention on a lately proposed theory that generalizes the turn model to maintain the network performance with deadlock-freedom. The theorems remain a challenge of applying themselves to arbitrary topologies including fully irregular networks. In this paper, we advance the theorems to completely general ones. Moreover, we provide a feasible implementation of a deadlock-free routing method based on our advanced theorem. Experimental results show that the routing method based on our proposed theorem can improve the network throughput by up to 138 % compared to a conventional deterministic minimal routing method. Moreover, when utilized as the escape path in Duato's protocol, it can improve the throughput by up to 26.3 % compared with the conventional up*/down* routing.
Multi-Voltage Variable Pipeline Routers with the Same Clock Frequency for Low-Power Network-on-Chips Systems
Akram BEN AHMED Hiroki MATSUTANI Michihiro KOIBUCHI Kimiyoshi USAMI Hideharu AMANO

PAPER

Vol:
E99-C No:8
Page(s):
909-917
In this paper, the Multi-voltage (multi-Vdd) variable pipeline router is proposed to reduce the power consumption of Network-on-Chips (NoCs) designed for Chip Multi-processors (CMPs). The multi-Vdd variable pipeline router adjusts its pipeline depth (i.e., communication latency) and supply voltage level in response to the applied workload. Unlike Dynamic Voltage and Frequency Scaling (DVFS) routers, the operating frequency remains the same for all routers throughout the CMP; thus, omitting the need to synchronize neighboring routers working at different frequencies. Two types of router architectures are presented: a Coarse-Grained Variable Pipeline (CG-VP) router that changes the voltage supplied to the entire router, and a Fine-Grained Variable Pipeline (FG-VP) router that uses a finer power partition. The evaluation results showed that the CG-VP and FG-VP routers achieve a 22.9% and 35.3% power reduction on average with 14% and 23% area overhead in comparison with a baseline router without variable pipelines, respectively. Thanks to the adopted look-ahead mechanism to switch the supply voltage, the performance overhead is only 4.4%.
MMLRU Selection Function: A Simple and Efficient Output Selection Function in Adaptive Routing
Michihiro KOIBUCHI Akiya JOURAKU Hideharu AMANO

PAPER-Computer Systems

Vol:
E88-D No:1
Page(s):
109-118
Adaptive routing algorithms, which dynamically select the route of a packet, have been widely studied for interconnection networks in massively parallel computers. An output selection function (OSF), which decides the output channel when some legal channels are free, is essential for an adaptive routing. In this paper, we propose a simple and efficient OSF called minimal multiplexed and least-recently-used (MMLRU). The MMLRU selection function has the following simple strategies for distributing the traffic: 1) each router locally grasps the congestion information by the utilization ratio of its own physical channels; 2) it is divided into the two selection steps, the choice from available physical channels and the choice from available virtual channels. The MMLRU selection function can be used on any type of network topology and adaptive routing algorithm. Simulation results show that the MMLRU selection function improves throughput and latency especially when the number of dimension becomes larger or the number of nodes per dimension become larger.
Improving the Performance of Circuit-Switched Interconnection Network for a Multi-FPGA System
Kohei ITO Kensuke IIZUKA Kazuei HIRONAKA Yao HU Michihiro KOIBUCHI Hideharu AMANO

PAPER

Pubricized:
2021/08/05
Vol:
E104-D No:12
Page(s):
2029-2039
Multi-FPGA systems have gained attention because of their high performance and power efficiency. A multi-FPGA system called Flow-in-Cloud (FiC) is currently being developed as an accelerator of multi-access edge computing (MEC). FiC consists of multiple mid-range FPGAs tightly connected by high-speed serial links. Since time-critical jobs are assumed in MEC, a circuit-switched network with static time-division multiplexing (STDM) switches has been implemented on FiC. This paper investigates techniques of enhancing the interconnection performance of FiC. Unlike switching fabrics for Network on Chips or parallel machines, economical multi-FPGA systems, such as FiC, use Xilinx Aurora IP and FireFly cables with multiple lanes. We adopted the link aggregation and the slot distribution for using multiple lanes. To mitigate the bottleneck between an STDM switch and user logic, we also propose a multi-ejection STDM switch. We evaluated various combinations of our techniques by using three practical applications on an FiC prototype with 24 boards. When the number of slots is large and transferred data size is small, the slot distribution was sometimes more effective, while the link aggregation was superior for other most cases. Our multi-ejection STDM switch mitigated the bottleneck in ejection ports and successfully reduced the number of time slots. As a result, by combining the link aggregation and multi-ejection STDM switch, communication performance improved up to 7.50 times with few additional resources. Although the performance of the fast Fourier transform with the highest communication ratio could not be enhanced by using multiple boards when a lane was used, 1.99 times performance improvement was achieved by using 8 boards with four lanes and our multi-ejection switch compared with a board.

21-25hit(25hit)

Author Search Result

[Author] Michihiro KOIBUCHI(25hit)

A Link Removal Methodology for Application-Specific Networks-on-Chip on FPGAs

A Generalized Theory Based on the Turn Model for Deadlock-Free Irregular Networks

Multi-Voltage Variable Pipeline Routers with the Same Clock Frequency for Low-Power Network-on-Chips Systems

MMLRU Selection Function: A Simple and Efficient Output Selection Function in Adaptive Routing

Improving the Performance of Circuit-Switched Interconnection Network for a Multi-FPGA System

Latest Issue

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles